[GLUTEN-11514][VL] Refactor plan execution by adding `addIteratorSplits` and `noMoreSplits` methods to the plan execution API #11527
Run Gluten Clickhouse CI on x86
 namespace gluten {

-class CudfVectorStream : public RowVectorStream {
+class CudfVectorStreamBase {
RowVectorStream's code is being inlined into CudfVectorStreamBase here.
}
}

void WholeStageResultIterator::noMoreSplits() {
Would this work if called from different threads? E.g., thread A calls `addIteratorSplits()` but thread B calls `noMoreSplits()`.
It's not guaranteed to be thread-safe. In the Delta PR, the plan is still created and executed in a single thread, so there is no requirement for thread safety on this API for now.
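For illustration only, the single-thread assumption above could be made explicit with a thread-affinity check. This is a minimal sketch, not code from this PR: `SingleThreadSplitApi`, `checkOwner`, and the counters are hypothetical names, and the real `WholeStageResultIterator` is not shown to do anything like this.

```cpp
#include <cassert>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in (not part of Gluten): records the constructing thread
// and rejects split API calls that arrive from any other thread.
class SingleThreadSplitApi {
 public:
  SingleThreadSplitApi() : owner_(std::this_thread::get_id()) {}

  // Mirrors the shape of addIteratorSplits(); here it only counts calls.
  void addIteratorSplits() {
    checkOwner();
    ++splitCalls_;
  }

  // Mirrors the shape of noMoreSplits(); marks the input as complete.
  void noMoreSplits() {
    checkOwner();
    noMoreSplits_ = true;
  }

  int splitCalls() const { return splitCalls_; }
  bool inputComplete() const { return noMoreSplits_; }

 private:
  // Throws if the caller is not the thread that created this object.
  void checkOwner() const {
    if (std::this_thread::get_id() != owner_) {
      throw std::logic_error("split API called from a non-owner thread");
    }
  }

  const std::thread::id owner_;
  int splitCalls_ = 0;
  bool noMoreSplits_ = false;
};
```

With a check like this, the "thread A adds splits, thread B calls `noMoreSplits()`" case would fail fast instead of racing silently.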
-void WholeStageResultIterator::tryAddSplitsToTask() {
-if (noMoreSplits_) {
+void WholeStageResultIterator::addIteratorSplits(const std::vector<std::shared_ptr<ResultIterator>>& inputIterators) {
+// Create IteratorConnectorSplit for each iterator
Should we also guard against `allSplitsAdded == true` here?
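One possible shape for such a guard, as a sketch under assumptions: the `ResultIterator` struct below is an empty placeholder, and `SplitGuardSketch`, `allSplitsAdded_`, and `pending_` are hypothetical names rather than the members actually used in `WholeStageResultIterator`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Empty placeholder for gluten's ResultIterator; only its identity matters here.
struct ResultIterator {};

// Hypothetical sketch: refuse new iterator splits once noMoreSplits() was called.
class SplitGuardSketch {
 public:
  void addIteratorSplits(const std::vector<std::shared_ptr<ResultIterator>>& inputIterators) {
    if (allSplitsAdded_) {
      throw std::logic_error("addIteratorSplits called after noMoreSplits");
    }
    // The real code would create one connector split per iterator; this
    // sketch just remembers the iterators.
    pending_.insert(pending_.end(), inputIterators.begin(), inputIterators.end());
  }

  // Safe to call more than once; later calls change nothing.
  void noMoreSplits() { allSplitsAdded_ = true; }

  std::size_t pendingCount() const { return pending_.size(); }

 private:
  bool allSplitsAdded_ = false;
  std::vector<std::shared_ptr<ResultIterator>> pending_;
};
```

Whether to throw, assert, or silently ignore the late call is a design choice; throwing is used here only because it makes the misuse visible.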
The change is not meant to support a concurrent execution model; execution is still single-threaded. The goal is to add the barrier API so that we can reuse the Velox task in the same thread for newer inputs. Let me know if there are any remaining issues. cc @rui-mo
…atorSplits` and `noMoreSplits` methods to the plan execution API (apache#11527)" This reverts commit 420b704.
…oreSplits on cuDF value stream nodes This fixes apache#11569 with a code style fix in passing.
…its on cuDF value stream nodes (#11572)
Split plan execution via `NativePlanEvaluator`/`ColumnarBatchOutIterator` into 3 APIs:

- `NativePlanEvaluator#create`
- `ColumnarBatchOutIterator#addIteratorSplits`
- `ColumnarBatchOutIterator#noMoreSplits`

This allows iterator-based splits to be added after the task has started. This is useful when `NativePlanEvaluator` is used on its own, outside the transformer code routine.

This is a prerequisite for #11514 and #11419.
Tested by existing tests and #11419
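The three-step flow in the description can be sketched with small mock types. Everything below is an assumption made for illustration: the real APIs live on the Java side, while `PlanEvaluatorSketch`, `OutIteratorSketch`, and `Batch` are hypothetical C++ stand-ins that only mimic the call order (create, add splits later, then signal that no more splits will come).

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical batch type; a real columnar batch is far richer.
using Batch = int;

// Stand-in for ColumnarBatchOutIterator: inputs may arrive after creation.
class OutIteratorSketch {
 public:
  // Step 2: feed iterator-based inputs, possibly long after the task started.
  void addIteratorSplits(const std::vector<std::vector<Batch>>& inputs) {
    for (const auto& input : inputs) {
      queue_.insert(queue_.end(), input.begin(), input.end());
    }
  }

  // Step 3: declare that no further inputs will be added.
  void noMoreSplits() { noMoreSplits_ = true; }

  // Pops the next batch into `out`; returns false if none is available now.
  bool next(Batch& out) {
    if (queue_.empty()) {
      return false;
    }
    out = queue_.front();
    queue_.pop_front();
    return true;
  }

  // Finished only when input is closed and everything has been drained.
  bool finished() const { return noMoreSplits_ && queue_.empty(); }

 private:
  std::deque<Batch> queue_;
  bool noMoreSplits_ = false;
};

// Stand-in for NativePlanEvaluator. Step 1: create the iterator with no inputs.
class PlanEvaluatorSketch {
 public:
  OutIteratorSketch create() const { return OutIteratorSketch(); }
};
```

In this mock, an iterator with an empty queue is not considered finished until `noMoreSplits()` has been called, which is the property that lets inputs arrive late.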